Abstract

Based on observations of points uniformly distributed over a convex
set in $\mathbb{R}^d$, a new estimator for the volume of the convex set is
proposed. The estimator is minimax optimal and also efficient
non-asymptotically: it is nearly unbiased with minimal variance among all
unbiased oracle-type estimators. Our approach is based on a Poisson
point process model and as an ingredient, we prove that the convex hull
is a sufficient and complete statistic. No hypotheses on the boundary
of the convex set are imposed. In a numerical study, we show that the
estimator outperforms earlier estimators for the volume. In addition,
an adjusted set estimator for the convex body itself is proposed.

1 Introduction


Driven by applications in image analysis and signal processing, the estimation 
of the support of a density attracts a lot of statistical activity. In many cases 
it is natural to assume a convex shape for the support set. First fundamental 
results for convex support estimation have been achieved by Korostelev and 
Tsybakov (1993, 1994) who prove minimax-optimal rates of convergence in 
Hausdorff distance for a set estimator. In particular, Korostelev and
Tsybakov (1993) prove that the convex hull of the points $\hat C_n$, which is a
maximum likelihood estimator for the set $C$, is rate-optimal. Interestingly, the
volume $|\hat C_n|$ of the convex hull is not rate-optimal for estimation of the
volume $|C|$ of the convex set and an alternative two-step estimator, optimal
up to a logarithmic factor, was proposed. A fully rate-optimal estimator for
the volume of a convex set with smooth boundary was then constructed by
Gayraud (1997) based on three-fold sample splitting. For various extensions
and applications of convex support estimation, let us refer to Mammen and
Tsybakov (1995); Guntuboyina (2012); Brunel (2014) and the literature cited
there. Related ideas under Hölder and monotonicity constraints, respectively,
have been adopted by Reiß and Selk (2015) for a one-sided regression model.

Our contribution is the construction of a very simple volume estimator
which is not only rate-optimal over all convex sets without boundary
restrictions, but even adaptive in the sense that it attains almost the parametric
rate if the convex set is a polytope. Our approach is non-asymptotic and
provides much more precise properties. The analysis is based on a Poisson
point process (PPP) observation model with intensity $\lambda > 0$ on the convex
set $C \subseteq \mathbb{R}^d$. We thus observe

$$X_1, \dots, X_N \overset{\text{i.i.d.}}{\sim} U(C), \qquad N \sim \mathrm{Poiss}(\lambda|C|),$$

where $(X_n)_{n \ge 1}$ and $N$ are independent, see Section 2 below for a concise
introduction to the PPP model. Using Poissonisation and de-Poissonisation
techniques, this model exhibits asymptotic properties like the uniform model,
i.e. a sample of $n = \lambda|C|$ random variables $X_1, \dots, X_n$ distributed
uniformly on $C$. The beautiful geometry of the PPP model, however, allows for
much more concise ideas and proofs, see also Meister and Reiß (2013) for
connections between PPP and regression models with irregular error
distributions. From an applied perspective, PPP models are often natural, e.g. for
spatial count data of photons or other emissions.

For known intensity $\lambda$ of the PPP, we construct in Section 3 an oracle
estimator $\hat\vartheta_{\mathrm{oracle}}$. Theorem 3.2 shows that this estimator is UMVU
(uniformly of minimum variance among unbiased estimators) and rate-optimal.
To this end, moment bounds from stochastic geometry for the missing
volume of the convex hull, obtained by Bárány and Larman (1988) and Dwyer
(1988), are essential. Moreover, we derive results of independent interest: the
convex hull $\hat C = \mathrm{conv}\{X_1, \dots, X_N\}$ forms a sufficient and complete statistic
(Proposition 3.5) and the Poisson point process, conditionally on $\hat C$, remains
Poisson within its convex hull (Theorem 3.1).

For the more realistic case of unknown intensity $\lambda$, we analyse in Section 4
our final estimator

$$\hat\vartheta \overset{\mathrm{def}}{=} \frac{N+1}{N_\circ+1}\,|\hat C|,$$

where $N_\circ$ denotes the number of observed points in the interior of $\hat C$. We are
able to prove a sharp oracle inequality, comparing the risk of this estimator
to that of $\hat\vartheta_{\mathrm{oracle}}$. Here, very recent and advanced results by Reitzner (2003);
Pardon (2011); Beermann and Reitzner (2015) on the variance of the number
of points $N_\partial$ on the boundary of $\hat C$ and the missing volume $|C \setminus \hat C|$ are of
key importance. This fascinating interplay between stochastic geometry and
statistics prevails throughout the work.

The lower bound showing that $\hat\vartheta$ is indeed minimax-optimal is proved
in Theorem 3.4 by adapting the proof of the lower bound in the uniform
model by Gayraud (1997). A small simulation study is presented in Section 5.
Moreover, we propose to enlarge the convex hull set by the factor
$((N+1)/(N_\circ+1))^{1/d}$ and we study its error as an estimator of the set $C$ itself. The
proof of Lemma 4.1 is deferred to the Appendix.


2 Digression on Poisson Point Processes 

Most of the results and notation are adapted from Karr (1991). We fix a
compact convex set $E$ in $\mathbb{R}^d$ with non-empty interior as a state space and
denote by $\mathcal E$ its Borel $\sigma$-algebra. We define the family of convex subsets
$\mathbf C = \{C \subseteq E, \text{ convex, closed}\}$ (this implies that all sets in $\mathbf C$ are compact)
and the family of compact subsets $\mathbf K = \{K \subseteq E, \text{ compact}\}$. It is natural
to equip the space $\mathbf C$ (resp. $\mathbf K$) with the Hausdorff metric $d_H$ and its Borel
$\sigma$-algebra $\mathfrak B_{\mathbf C}$ (resp. $\mathfrak B_{\mathbf K}$). Then $(\mathbf C, d_H)$ is a compact and thus separable
space and the mapping $(x_1, \dots, x_k) \mapsto \mathrm{conv}\{x_1, \dots, x_k\}$, which generates the
convex hull of points $x_i \in E$, is continuous from $E^k$ to $(\mathbf C, d_H)$.

On $(E, \mathcal E)$ we define the set of point measures $\mathbf M = \{m \text{ measure on } \mathcal E :
m(A) \in \mathbb N,\ \forall A \in \mathcal E\}$ equipped with the $\sigma$-algebra $\mathcal M = \sigma(m \mapsto m(A), A \in
\mathcal E)$. Let $C^+(E)$ be the collection of continuous functions $E \to [0, \infty)$ with
compact support. A useful topology for $\mathbf M$ is the vague topology which makes
$\mathbf M$ a complete, separable metric space, cf. Section 3.4 in Resnick (2013). A
sequence of point measures $m_n \in \mathbf M$ then converges vaguely to a limit $m \in \mathbf M$
if and only if $m_n[f] \to m[f]$ for all $f \in C^+(E)$ where $m[f] = \int_E f\,dm$. Let
$(\Omega, \mathcal F, \mathbb P)$ be an abstract probability space. We call a measurable mapping
$\mathcal N : \Omega \to \mathbf M$ a Poisson point process (PPP) of intensity $\lambda > 0$ on $C \in \mathbf C$ if

• for any $A \in \mathcal E$, we have $\mathcal N(A) \sim \mathrm{Poiss}(\lambda|A \cap C|)$, where $|A \cap C|$
denotes the Lebesgue measure of $A \cap C$;

• for all mutually disjoint sets $A_1, \dots, A_n \in \mathcal E$, the random variables
$\mathcal N(A_1), \dots, \mathcal N(A_n)$ are independent.

For statistical inference, we assume the Poisson point process to be defined on
a set of non-zero Lebesgue measure, i.e. $|C| > 0$. A more constructive and
intuitive representation of the PPP $\mathcal N$ is $\mathcal N = \sum_{i=1}^N \delta_{X_i}$ for $N \sim \mathrm{Poiss}(\lambda|C|)$
and i.i.d. random variables $(X_i)$, independent of $N$ and distributed
uniformly, $\mathbb P(X_i \in A) = |A \cap C|/|C|$, so that $\mathcal N(A) = \sum_{i=1}^N \mathbf 1(X_i \in A)$ for any
$A \in \mathcal E$.
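The constructive representation above translates directly into a sampler. The following is a minimal sketch, assuming for illustration that $C$ is an ellipse in the plane (the function names `sample_poisson` and `sample_ppp_ellipse` and the choice of set are ours, not part of the paper): first draw $N \sim \mathrm{Poiss}(\lambda|C|)$, then draw $N$ independent uniform points on $C$ by rejection from the bounding box.

```python
import math
import random


def sample_poisson(mean, rng):
    """Draw a Poisson(mean) variate by counting the arrivals of a
    unit-rate Poisson process that occur before time `mean`."""
    count, t = 0, rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def sample_ppp_ellipse(lam, a, b, rng):
    """Points of a PPP with intensity `lam` on the ellipse
    x^2/a^2 + y^2/b^2 <= 1 (an illustrative choice of the convex set C)."""
    area = math.pi * a * b                 # |C|
    n = sample_poisson(lam * area, rng)    # N ~ Poiss(lam * |C|)
    points = []
    while len(points) < n:
        # Rejection sampling: uniform on the bounding box, keep points in C.
        x, y = rng.uniform(-a, a), rng.uniform(-b, b)
        if (x / a) ** 2 + (y / b) ** 2 <= 1.0:
            points.append((x, y))
    return points


rng = random.Random(0)
pts = sample_ppp_ellipse(lam=50.0, a=2.0, b=1.0, rng=rng)
print(len(pts), "points; expected number:", 50.0 * math.pi * 2.0)
```

The Poisson count is generated from exponential inter-arrival times only to keep the sketch free of third-party dependencies; any Poisson sampler would do.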

We consider the convex hull of the PPP points $\hat C : \mathbf M \to \mathbf C$ defined by
$\hat C(\mathcal N) := \mathrm{conv}\{X_1, \dots, X_N\}$, which by the above continuity property of the
convex hull is a random element with values in the Polish space $(\mathbf C, d_H)$, see
also Davis, Mulrow, and Resnick (1987) for a detailed study of the continuity
of the convex hull. For a short notation, we shall further write $\hat C$ to denote
the convex hull of the process $\mathcal N$. In the sequel, conditional expectations and
probabilities with respect to $\hat C$ are thus well defined. We can also evaluate
the probability

$$\mathbb P_{C,\lambda}(\hat C \in A) = \sum_{k=0}^{\infty} \frac{e^{-\lambda|C|}\lambda^k}{k!} \int_{C^k} \mathbf 1\big(\mathrm{conv}\{x_1, \dots, x_k\} \in A\big)\, d(x_1, \dots, x_k)$$



for $A \in \mathfrak B_{\mathbf C}$. Usually, we only write the subscript $C$ or sometimes $(C, \lambda)$
when different probability distributions are considered simultaneously. The
likelihood function for $C \in \mathbf C$ and $\lambda, \lambda_0 > 0$ is then given by

$$\begin{aligned}
\frac{d\mathbb P_{C,\lambda}}{d\mathbb P_{E,\lambda_0}}(X_1, \dots, X_N)
&= e^{\lambda_0|E| - \lambda|C|}\,(\lambda/\lambda_0)^N\, \mathbf 1(\forall i = 1, \dots, N : X_i \in C) \\
&= e^{\lambda_0|E| - \lambda|C|}\,(\lambda/\lambda_0)^N\, \mathbf 1(\hat C \subseteq C),
\end{aligned} \tag{2.1}$$

cf. Thm. 1.3 in Kutoyants (1998). For the last line, we have used that a point
set is in $C$ if and only if its convex hull is contained in $C$.

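For the set-indexed process $(\mathcal N(K), K \in \mathbf K)$ we define its natural
set-indexed filtration

$$\mathcal F_K = \sigma(\{\mathcal N(U);\ U \subseteq K,\ U \in \mathbf K\})$$

for any $K \in \mathbf K$. The filtration $(\mathcal F_K, K \in \mathbf K)$ possesses the following
properties:

• monotonicity: $\mathcal F_{K_1} \subseteq \mathcal F_{K_2}$ for any $K_1, K_2 \in \mathbf K$ with $K_1 \subseteq K_2$,

• continuity from above: $\mathcal F_K = \bigcap_{n=1}^{\infty} \mathcal F_{K_n}$ if $K_n \downarrow K$;

cf. Zuyev (1999). By construction, the restriction $\mathcal N_K = \mathcal N(\cdot \cap K)$ of
the point process $\mathcal N$ onto $K \in \mathbf K$ is $\mathcal F_K$-measurable (in fact, $\mathcal F_K =
\sigma(\{\mathcal N_K(U);\ U \in \mathbf K\})$). In addition, it can be easily seen that $\mathcal N_K$ is a
Poisson point process in $\mathbf M$, cf. the Restriction Theorem in Kingman (1992),
and thus $\hat C(\mathcal N_K) = \mathrm{conv}(\{X_1, \dots, X_N\} \cap K)$ is by the above arguments
$\mathcal F_K$-measurable.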

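A random compact set $\hat{\mathcal K}$ is a measurable mapping $\hat{\mathcal K} : (\mathbf M, \mathcal M) \to
(\mathbf K, \mathfrak B_{\mathbf K})$. Note that Zuyev (1999) defines a random compact set as a
measurable mapping from $(\mathbf M, \mathcal M)$ to $(\mathbf K, \sigma_{\mathbf K})$ where $\sigma_{\mathbf K}$ is the so-called Effros
$\sigma$-algebra generated by the sets $\{F \in \mathbf K : F \cap K \ne \emptyset\}$, $K \in \mathbf K$. Thanks to
Thm. 2.7 in Molchanov (2006), the Effros $\sigma$-algebra $\sigma_{\mathbf K}$ induced on the
family of compact sets $\mathbf K$ coincides with the Borel $\sigma$-algebra $\mathfrak B_{\mathbf K}$, and we prefer
to stick to the first definition of a random compact set for convenience. Next,
we recall the definition of stopping sets from Rozanov (1982) in complete
analogy with stopping times.

Definition 2.1. A random compact set $\hat{\mathcal K}$ is called an $\mathcal F_K$-stopping set if
$\{\hat{\mathcal K} \subseteq K\} \in \mathcal F_K$ for all $K \in \mathbf K$. The sigma-algebra of $\hat{\mathcal K}$-history is defined
as $\mathcal F_{\hat{\mathcal K}} = \{A \in \mathfrak F : A \cap \{\hat{\mathcal K} \subseteq K\} \in \mathcal F_K\ \forall K \in \mathbf K\}$, where $\mathfrak F = \sigma(\mathcal F_K;\ K \in \mathbf K)$.

For a set $A \subseteq E$ let $A^c$ denote its complement.

Lemma 2.2. The set $\hat{\mathcal K} \overset{\mathrm{def}}{=} \overline{\hat C^c}$, the closure of the complement of the convex
hull, is an $(\mathcal F_K)$-stopping set.

Proof. We claim $\hat{\mathcal K} \subseteq K$ if and only if $K^c \subseteq \mathrm{conv}(\{X_1, \dots, X_N\} \cap K)$.
Indeed, if $\hat{\mathcal K} \subseteq K$ holds, then the boundary $\partial\hat C = \partial\hat{\mathcal K}$ is in $K$ which
implies $\mathrm{conv}(\{X_1, \dots, X_N\} \cap K) = \hat C$. Consequently, $K^c \subseteq \hat{\mathcal K}^c \subseteq \hat C =
\mathrm{conv}(\{X_1, \dots, X_N\} \cap K)$ holds. Conversely, $K^c \subseteq \mathrm{conv}(\{X_1, \dots, X_N\} \cap K)$
implies immediately $K^c \subseteq \hat C$ and thus $\hat C^c \subseteq K$. Since $K$ is closed, we obtain
$\hat{\mathcal K} \subseteq K$.

Since $\{X_1, \dots, X_N\} \cap K$ are the realisations of the point process inside $K$
and the convex hull is measurable, we conclude $\{K^c \subseteq \mathrm{conv}(\{X_1, \dots, X_N\} \cap
K)\} \in \mathcal F_K$. □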

We shall further use the following short notation: $N = \mathcal N(C)$ denotes
the total number of points, $N_\circ = \mathcal N(\hat C^\circ)$ the number of points in the interior
of the convex hull $\hat C$ and $N_\partial = \mathcal N(\partial\hat C) = \mathcal N(\partial\hat{\mathcal K})$ the number of points on
the boundary of the convex hull. For asymptotic bounds we write $f(x) =
O(g(x))$ or $f(x) \lesssim g(x)$ if $f(x)$ is bounded by a constant multiple of $g(x)$
and $f(x) \sim g(x)$ if $f(x) \lesssim g(x)$ as well as $g(x) \lesssim f(x)$.

3 Oracle case: intensity $\lambda$ is known

For a PPP on $C \in \mathbf C$ with intensity $\lambda > 0$, we know $N \sim \mathrm{Poiss}(\lambda|C|)$. In the
oracle case, when the intensity $\lambda$ is known, $N/\lambda$ estimates $|C|$ without bias
and yields the classical parametric rate in $\lambda$:

$$\mathbb E[(N/\lambda - |C|)^2] = \lambda^{-2}\operatorname{Var}(N) = \frac{|C|}{\lambda}.$$

Another natural idea might be to use the plug-in estimator $|\hat C|$ whose error
is given by the missing volume and satisfies

$$\mathbb E[(|\hat C| - |C|)^2] = \mathbb E[|C \setminus \hat C|^2] = O\big(|C|^{2(d-1)/(d+1)} \lambda^{-4/(d+1)}\big),$$

where the bound is obtained similarly to (3.4) and (3.5) below. This means
that its error is of smaller order than $\lambda^{-1}$ for $d \le 2$, but larger for $d \ge 4$.
For any $d \ge 2$, however, both convergence rates are worse than the
minimax-optimal rate $\lambda^{-(d+3)/(d+1)}$, established below.
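To make the comparison of the three rates concrete, here is a small sketch that tabulates the exponents $a$ in "MSE of order $\lambda^{-a}$" stated above, using exact fractions. The helper name `mse_exponents` is ours and only restates the exponents from the text.

```python
from fractions import Fraction


def mse_exponents(d):
    """Exponents a such that the mean squared error is of order lambda^(-a)
    in dimension d, as stated in the text."""
    return {
        "N/lambda": Fraction(1),              # parametric rate
        "hull volume": Fraction(4, d + 1),    # squared missing volume
        "minimax": Fraction(d + 3, d + 1),    # optimal rate
    }


for d in range(1, 7):
    row = {name: str(a) for name, a in mse_exponents(d).items()}
    print("d =", d, row)
```

A larger exponent means faster decay: the hull volume beats $N/\lambda$ only for $d \le 2$, the two tie at $d = 3$, and the minimax exponent is strictly larger than both for every $d \ge 2$.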

The way to improve these estimators is to observe that by the likelihood
representation (2.1) for $\lambda = \lambda_0$ and the Neyman factorisation criterion the
convex hull is a sufficient statistic. Consequently, by the Rao-Blackwell
theorem, the conditional expectation of $N/\lambda$ given the convex hull $\hat C$ is an
estimator with smaller mean squared error (MSE).

The number of points $N$ can be split into the number $N_\partial$ of points on
the boundary and the number $N_\circ$ of points in the interior of the convex hull.
The following theorem is essential in deriving the oracle estimator. Although
the statement of the theorem is quite intuitive and already used in Privault
(2012), the proof turns out to be nontrivial and is deferred to the Appendix.

Theorem 3.1. The number $N_\partial$ of points on the boundary of the convex
hull is measurable with respect to the sigma-algebra of $\hat{\mathcal K}$-history $\mathcal F_{\hat{\mathcal K}}$. The
number of points in the interior of the convex hull $N_\circ$ is, conditionally on
$\mathcal F_{\hat{\mathcal K}}$, Poisson-distributed:

$$N_\circ \mid \mathcal F_{\hat{\mathcal K}} \sim \mathrm{Poiss}(\lambda_\circ) \quad \text{with } \lambda_\circ = \lambda|\hat C|. \tag{3.1}$$

In addition, we have $\mathcal F_{\hat{\mathcal K}} = \sigma(\hat C)$, where the latter is the sigma-algebra
$\sigma(\{\hat C \subseteq B,\ B \in \mathbf C\})$ completed with the null sets in $\mathfrak F$.

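With Theorem 3.1 at hand, we obtain the oracle estimator

$$\hat\vartheta_{\mathrm{oracle}} = \mathbb E\Big[\frac{N}{\lambda} \,\Big|\, \hat C\Big] = \mathbb E\Big[\frac{N_\circ + N_\partial}{\lambda} \,\Big|\, \hat C\Big] = |\hat C| + \frac{N_\partial}{\lambda}, \tag{3.2}$$

where conditioning on $\hat C$ means conditioning on $\sigma(\hat C) = \mathcal F_{\hat{\mathcal K}}$.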
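As an illustration of (3.2), the following sketch computes $|\hat C| + N_\partial/\lambda$ for a planar point sample. It is a plain two-dimensional implementation under our own assumptions: the hull is found with Andrew's monotone chain, the area with the shoelace formula, and the points are taken to be in general position so that the points on the boundary of the hull are exactly its vertices (which holds almost surely for uniform points). All function names are hypothetical.

```python
def cross(o, a, b):
    """z-component of (a - o) x (b - o); positive for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; returns the hull vertices counter-clockwise."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(vertices):
    """Shoelace formula; returns 0 for fewer than three vertices."""
    n = len(vertices)
    if n < 3:
        return 0.0
    twice_area = sum(
        vertices[i][0] * vertices[(i + 1) % n][1]
        - vertices[(i + 1) % n][0] * vertices[i][1]
        for i in range(n)
    )
    return abs(twice_area) / 2.0


def oracle_estimate(points, lam):
    """Oracle estimator |hull| + N_boundary / lam for known intensity lam.
    Assumes general position: boundary points are exactly the hull vertices."""
    hull = convex_hull(points)
    return polygon_area(hull) + len(hull) / lam


sample = [(0.0, 0.0), (1.0, 0.0), (1.0, 1.0), (0.0, 1.0), (0.5, 0.5)]
print(oracle_estimate(sample, lam=4.0))
```

In the toy sample the hull is the unit square with four boundary points, so the estimate is the hull area plus $4/\lambda$.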

Theorem 3.2. For known intensity $\lambda > 0$, the oracle estimator $\hat\vartheta_{\mathrm{oracle}}$ is
unbiased and of minimal variance among all unbiased estimators (UMVU).
It satisfies

$$\operatorname{Var}(\hat\vartheta_{\mathrm{oracle}}) = \frac{1}{\lambda}\,\mathbb E[|C \setminus \hat C|].$$

Its worst case mean squared error over $\mathbf C$ decays as $\lambda \uparrow \infty$ like $\lambda^{-(d+3)/(d+1)}$
in dimension $d$:

$$\limsup_{\lambda \to \infty} \lambda^{(d+3)/(d+1)} \sup_{C \in \mathbf C,\, |C| > 0} \Big\{ |C|^{-(d-1)/(d+1)}\, \mathbb E[(\hat\vartheta_{\mathrm{oracle}} - |C|)^2] \Big\} < \infty.$$

Remark 3.3. The theorem implies that the rate of convergence for the RMSE
(root mean-squared error) of the estimator $\hat\vartheta_{\mathrm{oracle}}$ is $\lambda^{-(d+3)/(2d+2)}$. In
Theorem 3.4 below, we prove that the lower bound on the minimax risk in the
PPP model is of the same order, implying that the rate is minimax-optimal.
Even more, the oracle estimator is adaptive in the sense that its rate is
faster if the missing volume decays faster. In particular, for polytopes $C$ it is
shown in Bárány and Larman (1988) and independently in Dwyer (1988) that
$\mathbb E[|C \setminus \hat C|] \sim \lambda^{-1}(\log(\lambda|C|))^{d-1}$, which implies a faster (almost parametric)
rate of convergence for the RMSE of the oracle estimator.

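Proof. The unbiasedness follows immediately from the definition (3.2). By
the law of total variance, we obtain

$$\begin{aligned}
\operatorname{Var}(\hat\vartheta_{\mathrm{oracle}})
&= \operatorname{Var}\Big(\frac{N}{\lambda}\Big) - \mathbb E\Big[\operatorname{Var}\Big(\frac{N}{\lambda} \,\Big|\, \hat C\Big)\Big] \\
&= \frac{|C|}{\lambda} - \mathbb E\Big[\operatorname{Var}\Big(\frac{N_\circ}{\lambda} \,\Big|\, \hat C\Big)\Big] \\
&= \frac{|C|}{\lambda} - \mathbb E\Big[\frac{\lambda|\hat C|}{\lambda^2}\Big] \\
&= \frac{1}{\lambda}\,\mathbb E[|C \setminus \hat C|].
\end{aligned} \tag{3.3}$$

Proposition 3.5 below affirms that the convex hull $\hat C$ is not only a sufficient,
but also a complete statistic such that by the Lehmann-Scheffé theorem, the
estimator $\hat\vartheta_{\mathrm{oracle}}$ has the UMVU property.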

Finally, we bound the expectation of the missing volume $|C \setminus \hat C|$ by
Poissonisation, i.e. using that the convex hull $\hat C$ in the PPP model
conditionally on the event $\{N = k\}$ is distributed as the convex hull $\hat C_k =
\mathrm{conv}\{X_1, \dots, X_k\}$ in the model with $k$ uniform observations on $C$, for which
the following upper bound is known (e.g., Bárány and Larman (1988)):

$$\sup_{C \in \mathbf C,\, |C| > 0} \mathbb E\Big[\frac{|C \setminus \hat C_k|}{|C|}\Big] = O\big(k^{-2/(d+1)}\big). \tag{3.4}$$

Thus, it follows by a Poisson moment bound

$$\sup_{C \in \mathbf C,\, |C| > 0} \mathbb E\Big[\frac{|C \setminus \hat C|}{|C|^{(d-1)/(d+1)}}\Big]
= \sup_{C \in \mathbf C,\, |C| > 0} \bigg( \sum_{k=0}^{\infty} \frac{e^{-\lambda|C|}(\lambda|C|)^k}{|C|^{-2/(d+1)}\, k!}\, \mathbb E\Big[\frac{|C \setminus \hat C_k|}{|C|}\Big] \bigg)
= O\big(\lambda^{-2/(d+1)}\big). \tag{3.5}$$

This bound, together with (3.3), yields the assertion. □


















The lower bound for the risk in the PPP framework can be derived from 
the lower bound in the uniform model with a fixed number of observations, 
see Thm. 6 in Gayraud (1997). 

Theorem 3.4. For estimating $|C|$ in the PPP model with parameter class
$\mathbf C$, the following asymptotic lower bound holds

$$\liminf_{\lambda \to \infty} \lambda^{(d+3)/(d+1)} \inf_{\hat\vartheta_\lambda} \sup_{C \in \mathbf C} \mathbb E_C[(|C| - \hat\vartheta_\lambda)^2] > 0, \tag{3.6}$$

where the infimum extends over all estimators $\hat\vartheta_\lambda$ in the PPP model with
intensity $\lambda$.

Proof. We use that an estimator $\hat\vartheta_\lambda$ in the PPP model is an estimator in the
uniform model on the event $\{N = n\}$. Then, due to the lower bound in the
uniform model in Gayraud (1997), for a constant $c > 0$ and for all $n \in \mathbb N$
there exists a set $C_n \in \mathbf C$ with $|C_n| \sim 1$ such that for all $k = 1, \dots, n$,

$$\mathbb E_{C_n}[(|C_n| - \hat\vartheta_\lambda)^2 \mid N = k] \ge c\, n^{-(d+3)/(d+1)}, \quad \text{a.s.}$$

Then, in the PPP model for $C = C_{\lfloor\lambda\rfloor}$ with $\lambda|C| \ge 1$, we have

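$$\begin{aligned}
\mathbb E_C[(|C| - \hat\vartheta_\lambda)^2]
&= \sum_{k \in \mathbb N} \mathbb E_C[(|C| - \hat\vartheta_\lambda)^2 \mid N = k]\, \mathbb P(N = k) \\
&\ge \sum_{k \le \lfloor\lambda\rfloor} \mathbb E_C[(|C| - \hat\vartheta_\lambda)^2 \mid N = k]\, \mathbb P(N = k) \\
&\ge c\, \lfloor\lambda\rfloor^{-(d+3)/(d+1)} \big(1 - \mathbb P(N > \lfloor\lambda\rfloor)\big) \\
&\sim \lambda^{-(d+3)/(d+1)},
\end{aligned}$$

applying Chernoff's inequality to $N \sim \mathrm{Poiss}(\lambda|C|)$ for the last line. Thus,
the lower bound (3.6) follows. □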

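Proposition 3.5. For known intensity $\lambda > 0$, the convex hull $\hat C =
\mathrm{conv}\{X_1, \dots, X_N\}$ is a complete statistic.

Proof. We need to show the implication

$$\forall C \in \mathbf C : \mathbb E_C[T(\hat C)] = 0 \implies T(\hat C) = 0 \quad \mathbb P_E\text{-a.s.}$$

for any $\mathfrak B_{\mathbf C}$-measurable function $T : \mathbf C \to \mathbb R$. From the likelihood in (2.1)
for $\lambda = \lambda_0$, we derive

$$\mathbb E_C[T(\hat C)] = \mathbb E_E\big[T(\hat C) \exp(\lambda|E \setminus C|)\, \mathbf 1(\hat C \subseteq C)\big].$$

Since $\exp(\lambda|E \setminus C|)$ is deterministic, $\mathbb E_C[T(\hat C)] = 0$ for all $C \in \mathbf C$ implies

$$\forall C \in \mathbf C : \mathbb E_E[T(\hat C)\, \mathbf 1(\hat C \subseteq C)] = 0.$$

For $C \in \mathbf C$, define the family of convex subsets of $C$ as $[C] = \{A \in \mathbf C \mid A \subseteq
C\}$ such that $\hat C \subseteq C \iff \hat C \in [C]$. Splitting $T = T^+ - T^-$ with
non-negative $\mathfrak B_{\mathbf C}$-measurable functions $T^+$ and $T^-$, we infer that the measures
$\mu^\pm(B) = \mathbb E_E[T^\pm(\hat C)\, \mathbf 1(\hat C \in B)]$, $B \in \mathfrak B_{\mathbf C}$, agree on $\{[C] \mid C \in \mathbf C\}$.

Note that the brackets $\{[C] \mid C \in \mathbf C\}$ are $\cap$-stable due to $[A] \cap [C] =
[A \cap C]$ and $A \cap C \in \mathbf C$. If the $\sigma$-algebra $\mathcal C$ generated by $\{[C] \mid C \in \mathbf C\}$
contains $\mathfrak B_{\mathbf C}$, the uniqueness theorem asserts that the measures $\mu^+$, $\mu^-$ agree
on all Borel sets in $\mathfrak B_{\mathbf C}$, in particular on $\{T > 0\}$ and $\{T < 0\}$, which
entails $\mathbb E_E[T^+(\hat C)] = \mathbb E_E[T^-(\hat C)] = 0$. Thus, in this case, $T(\hat C) = 0$ holds
$\mathbb P_E$-a.s.

It remains to show that $\mathcal C = \sigma([C], C \in \mathbf C)$ equals the Borel $\sigma$-algebra
$\mathfrak B_{\mathbf C}$. This can be derived as a non-trivial consequence of Choquet's theorem,
see Thm. 7.8 in Molchanov (2006), but we propose a short self-contained
proof here. Let us define the family $\langle C \rangle = \{B \in \mathbf C \mid C \subseteq B\}$ of convex sets
containing $C$. Then the closed Hausdorff ball with center $C$ and radius $\varepsilon > 0$
has the representation

$$B_\varepsilon(C) = \{A \in \mathbf C \mid d_H(A, C) \le \varepsilon\} = \{A \in \mathbf C \mid U_{-\varepsilon}(C) \subseteq A \subseteq U_\varepsilon(C)\},$$

with $U_\varepsilon(C) = \{x \in E \mid \mathrm{dist}(x, C) \le \varepsilon\}$, $U_{-\varepsilon}(C) = \{x \in C \mid \mathrm{dist}(x, E \setminus C) \ge
\varepsilon\}$. Noting that $U_\varepsilon(C)$, $U_{-\varepsilon}(C)$ are closed and convex and thus in $\mathbf C$, we obtain

$$B_\varepsilon(C) = \langle U_{-\varepsilon}(C) \rangle \cap [U_\varepsilon(C)].$$

Since $(\mathbf C, d_H)$ is separable, our problem is reduced to proving that all angle
sets $\langle C \rangle$ for $C \in \mathbf C$ are in $\mathcal C$. A further reduction is achieved by noting
$\langle C \rangle = \bigcap_{x \in C} \langle x \rangle = \bigcap_{x \in C \cap \mathbb Q^d} \langle x \rangle$, setting $\langle x \rangle = \langle \{x\} \rangle$ for short, such that it
suffices to prove $\langle x \rangle \in \mathcal C$ for all $x \in E$.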

Now, let $x \in E$ and $C \in \mathbf C$ such that $x \notin C$. Then, by the
Hahn-Banach theorem, there are $\delta > 0$, $v \in \mathbb R^d$ such that $\langle v, c - x \rangle \ge \delta$ holds
for all $c \in C$. By a density argument, we may choose $\delta \in \mathbb Q^+$ and $v \in \mathbb Q^d$.
Denoting the corresponding hyperplane intersected with $E$ by $H_{\delta,v} = \{\xi \in
E \mid \langle v, \xi - x \rangle \ge \delta\}$, see Figure 1, we conclude

$$\langle x \rangle^c = \bigcup_{\delta \in \mathbb Q^+} \bigcup_{v \in \mathbb Q^d} [H_{\delta,v}] \in \mathcal C.$$

Consequently, $\langle x \rangle \in \mathcal C$ and thus $\mathfrak B_{\mathbf C} \subseteq \mathcal C$ hold. □

Figure 1: The construction used in the proof.

4 Unknown intensity $\lambda$: nearly unbiased estimation

In case the intensity $\lambda$ is unknown and the oracle estimator $\hat\vartheta_{\mathrm{oracle}}$ in (3.2)
is inaccessible, the maximum-likelihood approach suggests to use $N/|\hat C|$ as
an estimator for $\lambda$ in (2.1). This yields the plug-in estimator for the volume,

$$\hat\vartheta_{\mathrm{plugin}} = |\hat C| + \frac{N_\partial}{N}\,|\hat C|.$$

In the unlikely event $N = |\hat C| = 0$, we define $\hat\vartheta_{\mathrm{plugin}} = 0$. This estimator has
a significant bias due to the following result, which is proved in the appendix.

Lemma 4.1. For the bias of the plug-in MLE estimator $\hat\vartheta_{\mathrm{plugin}}$, it follows
with some universal constant $c > 0$

$$|C| - \mathbb E[\hat\vartheta_{\mathrm{plugin}}] \ge c\, \mathbb E[|C \setminus \hat C|]^2, \quad \forall C \in \mathbf C. \tag{4.1}$$

The maximal bias over $C \in \mathbf C$ is thus at least of order $\lambda^{-4/(d+1)}$, which
is worse than the minimax rate $\lambda^{-(d+3)/(2d+2)}$ for $d > 5$. Yet, in the
two-dimensional finite sample study of Section 5 below, its performance is quite
convincing. We surmise that $\hat\vartheta_{\mathrm{plugin}}$ is rate-optimal for $d \le 5$, but we leave
that question aside because the final estimator we propose will be nearly
unbiased and will satisfy an exact oracle inequality. In particular, it is
rate-optimal in any dimension. The new idea is to exploit that the number of
interior points of $\hat C$ satisfies $N_\circ \mid \hat C \sim \mathrm{Poiss}(\lambda_\circ)$, see (3.1).
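Because $N_\circ$ is conditionally Poisson, conditional expectations of functions of $N_\circ$ are plain power series. The sketch below (our own helper `poisson_expectation`, with an arbitrary illustrative value of $\lambda_\circ$) checks numerically the closed form $\mathbb E[(N_\circ+1)^{-1} \mid \hat C] = (1 - e^{-\lambda_\circ})/\lambda_\circ$, which is the identity used in the proof of Theorem 4.3.

```python
import math


def poisson_expectation(f, mean, terms=200):
    """E[f(N)] for N ~ Poisson(mean), truncating the series after `terms` terms."""
    total, pmf = 0.0, math.exp(-mean)   # pmf starts at P(N = 0)
    for k in range(terms):
        total += f(k) * pmf
        pmf *= mean / (k + 1)           # P(N = k + 1) obtained from P(N = k)
    return total


lam0 = 3.5  # illustrative value of lambda_o = lambda * |hull|
lhs = poisson_expectation(lambda k: 1.0 / (k + 1), lam0)
rhs = (1.0 - math.exp(-lam0)) / lam0
print(lhs, rhs)
```

The two numbers agree up to floating-point error, so $1/(N_\circ+1)$ underestimates $1/\lambda_\circ$ only by the exponentially small factor $e^{-\lambda_\circ}$.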

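Remark 4.2. There is no conditionally unbiased estimator for $\lambda_\circ^{-1}$ based
on observing $N_\circ \mid \hat C \sim \mathrm{Poiss}(\lambda_\circ)$ for $\lambda_\circ$ ranging over some open (non-empty)
interval. Otherwise, an estimator $\hat\mu(N_\circ)$ for $\lambda_\circ^{-1}$ would satisfy $\mathbb E[\hat\mu(N_\circ) \mid \hat C] =
\lambda_\circ^{-1}$, implying

$$\sum_{k=0}^{\infty} \frac{\lambda_\circ^k}{k!}\, \hat\mu(k)\, e^{-\lambda_\circ} = \lambda_\circ^{-1}
\implies \sum_{k=0}^{\infty} \frac{\lambda_\circ^{k+1}}{k!}\, \hat\mu(k) = \sum_{k=0}^{\infty} \frac{\lambda_\circ^k}{k!}.$$

The coefficient for the constant term in the left and right power series would
thus differ (0 versus 1), in contradiction with the uniqueness theorem for
power series.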

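We provide an almost unbiased estimator for $\lambda_\circ^{-1}$ by noting that the
first jump time of a time-indexed Poisson process with intensity $\nu$ is
$\mathrm{Exp}(\nu)$-distributed and thus has expectation $\nu^{-1}$. Taking conditional expectation of
the first jump time with respect to the value of the Poisson process at time
1, we conclude that

$$\hat\mu(N_\circ, \lambda_\circ) = \begin{cases} (N_\circ + 1)^{-1}, & \text{for } N_\circ \ge 1, \\ 1 + \lambda_\circ^{-1}, & \text{for } N_\circ = 0 \end{cases}$$

satisfies $\mathbb E[\hat\mu(N_\circ, \lambda_\circ) \mid \hat C] = \lambda_\circ^{-1}$. Omitting the term $\lambda_\circ^{-1}$, depending on $\lambda_\circ$, in
the unlikely case $N_\circ = 0$, we define our final estimator

$$\hat\vartheta = |\hat C| + \frac{N_\partial}{N_\circ + 1}\,|\hat C| = \frac{N+1}{N_\circ+1}\,|\hat C|.$$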
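The construction above reduces to a few lines of arithmetic once the hull volume and the point counts are available. The following sketch (function names are ours) implements the final estimator and the plug-in estimator from their definitions, and checks numerically that $\hat\mu(N_\circ, \lambda_\circ)$ has expectation $\lambda_\circ^{-1}$ under $N_\circ \sim \mathrm{Poiss}(\lambda_\circ)$ by summing the Poisson series.

```python
import math


def final_estimate(hull_volume, n_total, n_boundary):
    """Final estimator: hull volume times (N + 1) / (N_interior + 1)."""
    n_interior = n_total - n_boundary
    return hull_volume * (n_total + 1) / (n_interior + 1)


def plugin_estimate(hull_volume, n_total, n_boundary):
    """Plug-in estimator: hull volume times (1 + N_boundary / N); 0 if N = 0."""
    if n_total == 0:
        return 0.0
    return hull_volume * (1 + n_boundary / n_total)


def mu_hat(k, lam0):
    """Unbiased (but lam0-dependent) estimator of 1 / lam0 from k ~ Poisson(lam0)."""
    return 1.0 / (k + 1) if k >= 1 else 1.0 + 1.0 / lam0


def poisson_expectation(f, mean, terms=200):
    """E[f(N)] for N ~ Poisson(mean) via a truncated series."""
    total, pmf = 0.0, math.exp(-mean)
    for k in range(terms):
        total += f(k) * pmf
        pmf *= mean / (k + 1)
    return total


lam0 = 2.0
print(poisson_expectation(lambda k: mu_hat(k, lam0), lam0), 1.0 / lam0)
print(final_estimate(2.0, n_total=9, n_boundary=4))
```

Only the counts $N$, $N_\partial$ and the hull volume enter the final estimator, so the unknown intensity is never needed.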



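For the proofs, we also define the pseudo-estimator

$$\hat\vartheta_{\mathrm{pseudo}} \overset{\mathrm{def}}{=} |\hat C| + |\hat C|\, N_\partial \Big( \frac{1}{N_\circ + 1} + \lambda_\circ^{-1} e^{-\lambda_\circ} \Big).$$

Theorem 4.3. The pseudo-estimator $\hat\vartheta_{\mathrm{pseudo}}$ is unbiased and the estimator
$\hat\vartheta$ is asymptotically unbiased in the sense that with constants $c_1, c_2 > 0$
depending on $d$, $d > 1$, whenever $\lambda|C| \ge 1$:

$$0 \le |C| - \mathbb E[\hat\vartheta] \le c_1 |C| \exp\big(-c_2 (\lambda|C|)^{(d-1)/(d+1)}\big), \quad \forall C \in \mathbf C.$$

Proof. We have

$$\mathbb E\Big[\frac{1}{N_\circ + 1} \,\Big|\, \hat C\Big] = e^{-\lambda_\circ} \lambda_\circ^{-1} \sum_{k=0}^{\infty} \frac{\lambda_\circ^{k+1}}{(k+1)\,k!} = \frac{1 - e^{-\lambda_\circ}}{\lambda_\circ},$$

which by $|\hat C| \lambda_\circ^{-1} = \lambda^{-1}$ and $\mathbb E[\hat\vartheta_{\mathrm{oracle}}] = |C|$ implies unbiasedness of $\hat\vartheta_{\mathrm{pseudo}}$.
Thus, it follows that

$$|C| - \mathbb E[\hat\vartheta] = \mathbb E\big[|\hat C|\, N_\partial\, e^{-\lambda_\circ} \lambda_\circ^{-1}\big] = \lambda^{-1}\, \mathbb E\big[N_\partial\, e^{-\lambda|\hat C|}\big].$$

We exploit the deviation inequality from Thm. 1 in Brunel (2013) and derive
the bound for the exponential moment of the missing volume in the model
with fixed number of points

$$\mathbb E[\exp(\lambda|C \setminus \hat C_k|)] \le b_1 \exp\big(b_2\, \lambda|C|\, k^{-2/(d+1)}\big), \quad k \ge 2,$$

for positive constants $b_1, b_2$, depending on the dimension according to Brunel
(2013). For the cases $k = 0, 1$, we have the identity $\mathbb E[\exp(\lambda|C \setminus \hat C_k|)] =
\exp(\lambda|C|)$. By Poissonisation, similarly to (3.5), we derive

$$\exp(-\lambda|C|)\, \mathbb E[\exp(\lambda|C \setminus \hat C|)] \le b_3 \exp\big(-c_2 (\lambda|C|)^{(d-1)/(d+1)}\big), \tag{4.2}$$

for positive constants $b_3, c_2$, depending on the dimension. Hence, using the
Cauchy-Schwarz inequality and the bound for the moments of the points on
the convex hull,

$$\mathbb E[N_\partial^q] = O\big((\lambda|C|)^{q(d-1)/(d+1)}\big), \quad q \in \mathbb N, \tag{4.3}$$

see e.g. Section 2.3.2 in Brunel (2014), we derive for a constant $c_1 > 0$

$$\begin{aligned}
\lambda^{-1}\, \mathbb E\big[N_\partial\, e^{-\lambda|\hat C|}\big]
&\le \lambda^{-1} e^{-\lambda|C|}\, \mathbb E[N_\partial^2]^{1/2}\, \mathbb E\big[e^{2\lambda|C \setminus \hat C|}\big]^{1/2} \\
&\le c_1 \lambda^{-2/(d+1)} |C|^{(d-1)/(d+1)} \exp\big(-c_2 (\lambda|C|)^{(d-1)/(d+1)}\big) \\
&\le c_1 |C| \exp\big(-c_2 (\lambda|C|)^{(d-1)/(d+1)}\big).
\end{aligned} \tag{4.4}$$

□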

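The next step of the analysis is to compare the variance of the
pseudo-estimator $\hat\vartheta_{\mathrm{pseudo}}$ with the variance of the oracle estimator $\hat\vartheta_{\mathrm{oracle}}$, which is
UMVU.

Theorem 4.4. The following oracle inequality holds with a universal
constant $c > 0$ and dimension-dependent constants $c_1, c_2 > 0$ for all $C \in \mathbf C$ with
$\lambda|C| \ge 1$:

$$\operatorname{Var}(\hat\vartheta_{\mathrm{pseudo}}) \le (1 + c\,\alpha(\lambda, C)) \operatorname{Var}(\hat\vartheta_{\mathrm{oracle}}) + r(\lambda, C),$$

where

$$\alpha(\lambda, C) = \frac{1}{|C|} \Big( \frac{1}{\lambda} + \frac{\operatorname{Var}(|C \setminus \hat C|)}{\mathbb E[|C \setminus \hat C|]} + \mathbb E[|C \setminus \hat C|] \Big),$$

$$r(\lambda, C) = c_1 (\lambda|C|)^{2(d-1)/(d+1)} \exp\big(-c_2 (\lambda|C|)^{(d-1)/(d+1)}\big).$$

Proof. By the law of total variance, we obtain

$$\begin{aligned}
\operatorname{Var}(\hat\vartheta_{\mathrm{pseudo}})
&= \operatorname{Var}\big(\mathbb E[\hat\vartheta_{\mathrm{pseudo}} \mid \hat C]\big) + \mathbb E\big[\operatorname{Var}(\hat\vartheta_{\mathrm{pseudo}} \mid \hat C)\big] \\
&= \operatorname{Var}(\hat\vartheta_{\mathrm{oracle}}) + \mathbb E\Big[(N_\partial |\hat C|)^2 \operatorname{Var}\Big(\frac{1}{N_\circ + 1} \,\Big|\, \hat C\Big)\Big].
\end{aligned}$$

In view of $N_\circ \mid \hat C \sim \mathrm{Poiss}(\lambda_\circ)$, a power series expansion gives

$$\mathbb E[(N_\circ + 1)^{-2} \mid \hat C] = \lambda_\circ^{-1} e^{-\lambda_\circ} \int_0^{\lambda_\circ} (e^t - 1)/t \; dt.$$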

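The conditional variance can for $\lambda_\circ \to \infty$ thus be bounded by

$$\begin{aligned}
\operatorname{Var}\big((1 + N_\circ)^{-1} \mid \hat C\big)
&\le \lambda_\circ^{-1} e^{-\lambda_\circ} \int_{\lambda_\circ/2}^{\lambda_\circ} e^t/t \; dt - \lambda_\circ^{-2} + O(e^{-\lambda_\circ/4}) \\
&= \lambda_\circ^{-1} \int_0^{\lambda_\circ/2} \big((\lambda_\circ - s)^{-1} - \lambda_\circ^{-1}\big) e^{-s} \, ds + O(e^{-\lambda_\circ/4}) \\
&= \lambda_\circ^{-3} (1 + o(1)),
\end{aligned}$$

where we have used $(\lambda_\circ - s)^{-1} - \lambda_\circ^{-1} = s \lambda_\circ^{-1} (\lambda_\circ - s)^{-1}$, $\int_0^{\infty} s e^{-s} ds = 1$ and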
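dominated convergence. Thanks to $(N_\circ + 1)^{-1} \in [0, 1]$ we conclude for some
constant $c \ge 1$

$$\operatorname{Var}\big((1 + N_\circ)^{-1} \mid \hat C\big) \le c\,(1 \wedge \lambda_\circ^{-3}).$$

Consequently, we have

$$\begin{aligned}
\operatorname{Var}(\hat\vartheta_{\mathrm{pseudo}})
&\le \operatorname{Var}(\hat\vartheta_{\mathrm{oracle}}) + \mathbb E\big[(N_\partial |\hat C|)^2\, c\,(1 \wedge (\lambda|\hat C|)^{-3})\big] \\
&= \operatorname{Var}(\hat\vartheta_{\mathrm{oracle}}) + c\, \mathbb E\big[(N_\partial |\hat C|)^2 \wedge \lambda^{-3} (N_\partial)^2 |\hat C|^{-1}\big],
\end{aligned}$$

and with (3.3)

$$\begin{aligned}
\frac{\operatorname{Var}(\hat\vartheta_{\mathrm{pseudo}})}{\operatorname{Var}(\hat\vartheta_{\mathrm{oracle}})}
&\le 1 + c\, \frac{\mathbb E\big[(N_\partial \lambda|\hat C|)^2 \wedge (N_\partial)^2 (\lambda|\hat C|)^{-1}\big]}{\lambda\, \mathbb E[|C \setminus \hat C|]} \\
&= 1 + c\, \frac{\mathbb E\big[(N_\partial)^2 \big((\lambda|\hat C|)^2 \wedge (\lambda|\hat C|)^{-1}\big)\big]}{\mathbb E[N_\partial]}.
\end{aligned} \tag{4.5}$$

Define the 'good' event $\mathcal G = \{|\hat C| > |C|/2\}$, on which $\big((\lambda|\hat C|)^2 \wedge (\lambda|\hat C|)^{-1}\big) \le
2(\lambda|C|)^{-1}$. On the complement $\mathcal G^c$, we infer from $A^2 \wedge A^{-1} \le 1$ for $A > 0$

$$\begin{aligned}
\mathbb E\big[(N_\partial)^2 \big((\lambda|\hat C|)^2 \wedge (\lambda|\hat C|)^{-1}\big) \mathbf 1_{\mathcal G^c}\big]
&\le \mathbb E[N_\partial^2\, \mathbf 1_{\mathcal G^c}] \\
&\le \mathbb E[N_\partial^4]^{1/2}\, \mathbb P\big(|C \setminus \hat C| \ge |C|/2\big)^{1/2} \\
&\le c_1 (\lambda|C|)^{2(d-1)/(d+1)} \exp\big(-c_2 (\lambda|C|)^{(d-1)/(d+1)}\big),
\end{aligned} \tag{4.6}$$

for some positive constants $c_1$ and $c_2$, using (4.2) and (4.3). It remains to
estimate the upper bound (4.5) on $\mathcal G$:

$$\frac{2c\, \mathbb E[N_\partial^2]}{\lambda|C|\, \mathbb E[N_\partial]} = \frac{2c}{\lambda|C|} \Big( \frac{\operatorname{Var}(N_\partial)}{\mathbb E[N_\partial]} + \mathbb E[N_\partial] \Big). \tag{4.7}$$

Using the identity (17) in Beermann and Reitzner (2015) for the factorial
moments for the number of vertices $N_\partial$, we derive $\operatorname{Var}(N_\partial) \le \lambda^2 \operatorname{Var}(|C \setminus
\hat C|) + \lambda\, \mathbb E[|C \setminus \hat C|]$ in view of $\mathbb E[N_\partial] = \lambda\, \mathbb E[|C \setminus \hat C|]$. Thus, (4.7) is bounded by

$$\frac{2c\, \mathbb E[N_\partial^2]}{\lambda|C|\, \mathbb E[N_\partial]} \le \frac{2c}{|C|} \Big( \frac{1}{\lambda} + \frac{\operatorname{Var}(|C \setminus \hat C|)}{\mathbb E[|C \setminus \hat C|]} + \mathbb E[|C \setminus \hat C|] \Big),$$

which yields the assertion.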

□

As a result, we obtain an oracle inequality for the estimator $\hat\vartheta$.

Theorem 4.5. It follows for the risk of the estimator $\hat\vartheta$ for all $C \in \mathbf C$
whenever $\lambda|C| \ge 1$:

$$\mathbb E[(\hat\vartheta - |C|)^2]^{1/2} \le (1 + c\,\alpha(\lambda, C))\, \mathbb E[(\hat\vartheta_{\mathrm{oracle}} - |C|)^2]^{1/2} + r(\lambda, C),$$

with constant $c > 0$ and $\alpha(\lambda, C)$, $r(\lambda, C)$ from Theorem 4.4. For any $C \in \mathbf C$
and $\lambda > 0$ we have $\alpha(\lambda, C) \le 1 + \frac{1}{\lambda|C|}$.

Proof. In view of $\lambda_\circ = \lambda|\hat C|$, we have $\hat\vartheta = \hat\vartheta_{\mathrm{pseudo}} - \lambda^{-1} N_\partial e^{-\lambda_\circ}$ and we
derive as in (4.4) and (4.6) with some constants $c_1, c_2 > 0$

$$\mathbb E[(\hat\vartheta - \hat\vartheta_{\mathrm{pseudo}})^2] \le \lambda^{-2}\, \mathbb E[N_\partial^4]^{1/2}\, \mathbb E\big[e^{-4\lambda|\hat C|}\big]^{1/2} \le c_1^2 \exp\big(-2c_2 (\lambda|C|)^{(d-1)/(d+1)}\big).$$

To establish the oracle inequality, we apply the triangle inequality in $L^2$-norm
together with Theorems 3.2 and 4.4.

The universal bound on $\alpha(\lambda, C)$ follows from the rough bound $\mathbb E[|C \setminus
\hat C|^2] \le |C|\, \mathbb E[|C \setminus \hat C|]$. □

Note that the remainder term $r(\lambda, C)$ is exponentially small in $\lambda|C|$.
Therefore, an immediate implication of Theorem 4.5 is that asymptotically
our estimator $\hat\vartheta$ is minimax rate-optimal in all dimensions, where the lower
bound is proved in Theorem 3.4. Yet, even more is true: the oracle
inequality is in all well studied cases exact in the sense that $\alpha(\lambda, C) \to 0$ holds
for $\lambda \to \infty$ such that the UMVU risk of $\hat\vartheta_{\mathrm{oracle}}$ is attained asymptotically.

Lemma 4.6. We have tighter bounds on $\alpha(\lambda, C)$ from Theorem 4.4 in the
following cases:

1. for $d = 1, 2$ and $C \in \mathbf C$ arbitrary: $\alpha(\lambda, C) \lesssim (\lambda|C|)^{-2/(d+1)}$,

2. for $d \ge 2$, $C$ with $C^2$-boundary of positive curvature: $\alpha(\lambda, C) \lesssim
(\lambda|C|)^{-2/(d+1)}$,

3. for $d \ge 2$ and $C$ a polytope: $\alpha(\lambda, C) \lesssim \lambda^{-1} (\log(\lambda|C|))^{d-1}$.

Proof. Let us restrict to $|C| = 1$, the case of general volume follows by
rescaling. In view of the expectation upper bound (3.5), the main issue is
to bound $\operatorname{Var}(|C \setminus \hat C|)/\mathbb E[|C \setminus \hat C|]$ uniformly. Case (1) follows from Pardon
(2011), where $\lambda \operatorname{Var}(|C \setminus \hat C|) \sim \mathbb E[|C \setminus \hat C|]$ is established.

Figure 2: The two convex sets (blue), observations (points), their convex hulls
(black lines) and dilated convex hulls (black dashed lines).

For case (2) with smooth boundary, the upper bound for the variance,
$\operatorname{Var}(|C \setminus \hat C|) \lesssim \lambda^{-(d+3)/(d+1)}$, was obtained in Reitzner (2005), while the lower
bound for the first moment, $\mathbb E[|C \setminus \hat C|] \gtrsim \lambda^{-2/(d+1)}$, is due to Schütt (1994).

For the case (3) of polytopes, the upper bound $\operatorname{Var}(|C \setminus \hat C|) \lesssim
\lambda^{-2} (\log \lambda)^{d-1}$ was obtained in Bárány and Reitzner (2010), while the lower
bound for the first moment, $\mathbb E[|C \setminus \hat C|] \gtrsim \lambda^{-1} (\log \lambda)^{d-1}$, was proved in Bárány
and Larman (1988). The expectation upper bound from Remark 3.3 thus
yields the result. □

One could conjecture that $\lambda \operatorname{Var}(|C \setminus \hat C|) \sim \mathbb E[|C \setminus \hat C|]$ holds universally for
all convex sets in arbitrary dimensions and thus that the oracle inequality is
always exact. Proving such a universal bound is a challenging open problem
in stochastic geometry, strongly connected to the discussion on universal
variance asymptotics in terms of the floating body by Bárány and Reitzner
(2010).
5 Finite sample behaviour and dilated hull estimator

Figure 3: Monte Carlo RMSE estimates for the studied estimators for the
volume of two convex sets: a polygon and an ellipse.

In this section, we demonstrate the performance of the main estimator $\hat\vartheta$
numerically and compare it to other estimators including the naive estimator
$|\hat C|$, the naive oracle estimator $N/\lambda$, the UMVU oracle estimator $\hat\vartheta_{\mathrm{oracle}}$ and
the plug-in MLE estimator $\hat\vartheta_{\mathrm{plugin}} = |\hat C|(1 + N_\partial/N)$. The main competitor
from the literature is a rate-optimal estimator proposed in Gayraud (1997).
In their construction, the whole sample is divided into three equal parts
$X$, $X'$ and $X''$ of sizes $N^*$ (without loss of generality $N^* \in \mathbb N$) and the
estimator is given by

$$\hat\vartheta_G = |\hat C'| + \frac{|\hat C''|}{N^*} \sum_{i=1}^{N^*} \mathbf 1(X_i \notin \hat C'),$$

where $\hat C''$ is the convex hull of the third sample $X''$. The data points are
simulated for two convex sets: an ellipse and a polygon; see Figure 2.

The RMSE estimate normalised by the area of the true set is based on
$M = 500$ Monte Carlo iterations in each case. The results of the simulations
are depicted in Figure 3 where $n = \lambda|C|$ denotes the expected total number
of points. The worst convergence rate of $N/\lambda$ is clearly visible. More
importantly, we see that the RMSE of $\hat\vartheta$ approaches the oracle risk for larger $n$ (i.e.
$\lambda$) as the oracle inequality predicts. It is also conspicuous that in the
studied cases the plug-in estimator $\hat\vartheta_{\mathrm{plugin}}$ and the estimator $\hat\vartheta$ perform rather
similarly. This is explained by the fact that the number of points $N_\partial$ on the
convex hull increases with a moderate speed in the two-dimensional case,
$\mathbb E[N_\partial] = O(\lambda^{1/3})$, which results in a small difference between the
multiplication factors $N_\partial/N$ and $N_\partial/(N_\circ + 1)$. The simulations in two dimensions
were implemented using the R package "spatstat" by Baddeley and Turner
(2005). To illustrate the sub-optimality of the plug-in estimator $\hat\vartheta_{\mathrm{plugin}}$ in
high dimensions, we provide results of numerical simulations in dimensions
$d = 3, 4, 5, 6$ for the case when the true set $C$ is a unit cube $C = [0, 1]^d$, see
Figure 5. The simulations were implemented using the R package "geometry"
by Habel, Grasman, Stahel, and Sterratt (2014).
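For readers who want to reproduce the flavour of such a study without R, the following is a rough Monte Carlo sketch in plain Python. It is not the authors' simulation code: it assumes the unit square $C = [0,1]^2$ as true set, uses our own helper functions, and takes the hull vertices as the boundary points (valid almost surely). Gayraud's estimator is omitted.

```python
import math
import random


def sample_poisson(mean, rng):
    """Poisson(mean) variate: number of unit-rate arrivals before time `mean`."""
    count, t = 0, rng.expovariate(1.0)
    while t < mean:
        count += 1
        t += rng.expovariate(1.0)
    return count


def cross(o, a, b):
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])


def convex_hull(points):
    """Andrew's monotone chain; hull vertices in counter-clockwise order."""
    pts = sorted(set(points))
    if len(pts) <= 2:
        return pts
    lower, upper = [], []
    for p in pts:
        while len(lower) >= 2 and cross(lower[-2], lower[-1], p) <= 0:
            lower.pop()
        lower.append(p)
    for p in reversed(pts):
        while len(upper) >= 2 and cross(upper[-2], upper[-1], p) <= 0:
            upper.pop()
        upper.append(p)
    return lower[:-1] + upper[:-1]


def polygon_area(v):
    """Shoelace formula; 0 for degenerate hulls."""
    n = len(v)
    if n < 3:
        return 0.0
    return abs(sum(v[i][0] * v[(i + 1) % n][1] - v[(i + 1) % n][0] * v[i][1]
                   for i in range(n))) / 2.0


def simulate_rmse(lam, reps, seed=0):
    """Monte Carlo RMSE of several volume estimators on C = [0,1]^2, |C| = 1."""
    rng = random.Random(seed)
    sq_err = {"hull": 0.0, "N/lambda": 0.0, "oracle": 0.0, "plugin": 0.0, "final": 0.0}
    for _ in range(reps):
        n = sample_poisson(lam, rng)                       # N ~ Poiss(lam * |C|)
        pts = [(rng.random(), rng.random()) for _ in range(n)]
        hull = convex_hull(pts)
        vol, n_bd = polygon_area(hull), len(hull)          # |hull| and N_boundary
        estimates = {
            "hull": vol,
            "N/lambda": n / lam,
            "oracle": vol + n_bd / lam,
            "plugin": vol * (1 + n_bd / n) if n > 0 else 0.0,
            "final": vol * (n + 1) / (n - n_bd + 1),
        }
        for name, value in estimates.items():
            sq_err[name] += (value - 1.0) ** 2
    return {name: math.sqrt(s / reps) for name, s in sq_err.items()}


rmse = simulate_rmse(lam=200.0, reps=200)
for name, value in rmse.items():
    print(f"{name:10s} {value:.4f}")
```

With these settings the final estimator should track the oracle estimator closely and clearly improve on the raw hull volume, in line with the qualitative picture described above; the exact numbers depend on the seed.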

As an application of the obtained results, we propose a new estimator for 
the convex set itself: 


C = ^o+(^- 


+ d) 7 (®~®o) 


- po + 


y \c\ J 

N + 1 \ l/d 
N a + 1 


xeC 


(x - x 0 ) 


X e C , 


which is just the dilation of the convex hull $\hat C$ from its barycentre
$\hat x_0$, see the dashed polygons in Figure 2. Since the convex hull is a
sufficient statistic (for known $\lambda$), the points in its interior do not
bear any information on the shape of $C$ itself, so that the barycentre is a
reasonable choice. There are, of course, other enlargements of the convex
hull conceivable, like
$\operatorname{argmin}_{B \supseteq \hat C,\, |B| = \hat\vartheta}\, d_H(B, \hat C)$,
the convex set closest (in Hausdorff distance) to $\hat C$ with volume
$\hat\vartheta$. The intuition behind these estimators is based on the
observation that once the volume of the true set is known, we can estimate
the set itself faster (in the constant), and $\hat\vartheta$ is a reasonable
substitute for the true volume due to its fast rate of convergence.
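
To make the dilation concrete, here is a small Python sketch for d = 2. It
is an illustration under assumptions rather than the implementation used in
the paper: the barycentre is taken to be the area centroid of the hull
polygon, the hull vertices are assumed to be given in order, and the function
names are hypothetical. The dilated polygon then has area
$|\hat C|(N + 1)/(N_\circ + 1)$ by construction.

```python
def polygon_area_and_centroid(vertices):
    """Area and area centroid of a simple polygon with vertices given in order."""
    a2 = cx = cy = 0.0
    n = len(vertices)
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        w = x1 * y2 - x2 * y1
        a2 += w
        cx += (x1 + x2) * w
        cy += (y1 + y2) * w
    signed_area = a2 / 2.0
    centroid = (cx / (6.0 * signed_area), cy / (6.0 * signed_area))
    return abs(signed_area), centroid


def dilate_hull(hull_vertices, n_total):
    """Dilate the hull from its centroid by ((N + 1) / (N_interior + 1)) ** (1 / 2)."""
    n_boundary = len(hull_vertices)  # assumes every boundary point is a vertex
    n_interior = n_total - n_boundary
    factor = ((n_total + 1) / (n_interior + 1)) ** 0.5  # exponent 1/d with d = 2
    _, (x0, y0) = polygon_area_and_centroid(hull_vertices)
    return [(x0 + factor * (x - x0), y0 + factor * (y - y0))
            for x, y in hull_vertices]
```

For example, a hull with four vertices and area 2 obtained from N = 10
points is mapped to a polygon of area 2 · 11/7 with the same centroid.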

A detailed analysis is not pursued here, but in a small simulation study
we investigate the behaviour of the new dilated hull estimator for the above
polygon. The error ratio
$\mathbb E[|\tilde C \triangle C|] / \mathbb E[|\hat C \triangle C|]$ in terms
of the symmetric difference
$A \triangle B = (A \setminus B) \cup (B \setminus A)$ is approximated in
M = 500 Monte Carlo iterations and shown in Figure 4. It turns out that the
dilation significantly improves the convex hull as an estimator for $C$,
especially for a small number of observations.
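
The area of a symmetric difference of two convex polygons can be computed
exactly from $|A \triangle B| = |A| + |B| - 2|A \cap B|$. The following
Python sketch shows one possible way to do this with Sutherland-Hodgman
clipping. It is an assumption about how such a computation might be carried
out, not a description of the code behind Figure 4, and the function names
are hypothetical. Both polygons must be convex with vertices in
counter-clockwise order.

```python
def polygon_area(vertices):
    """Shoelace formula; returns 0 for fewer than three vertices."""
    n = len(vertices)
    if n < 3:
        return 0.0
    s = 0.0
    for i in range(n):
        x1, y1 = vertices[i]
        x2, y2 = vertices[(i + 1) % n]
        s += x1 * y2 - x2 * y1
    return abs(s) / 2.0


def clip_convex(subject, clip):
    """Intersection of two convex counter-clockwise polygons (Sutherland-Hodgman)."""

    def inside(p, a, b):
        # p lies on the left of (or on) the directed edge a -> b
        return (b[0] - a[0]) * (p[1] - a[1]) - (b[1] - a[1]) * (p[0] - a[0]) >= 0

    def intersect(p, q, a, b):
        # Intersection of the line through p, q with the line through a, b
        dx1, dy1 = q[0] - p[0], q[1] - p[1]
        dx2, dy2 = b[0] - a[0], b[1] - a[1]
        denom = dx1 * dy2 - dy1 * dx2
        t = ((a[0] - p[0]) * dy2 - (a[1] - p[1]) * dx2) / denom
        return (p[0] + t * dx1, p[1] + t * dy1)

    output = list(subject)
    for i in range(len(clip)):
        a, b = clip[i], clip[(i + 1) % len(clip)]
        current, output = output, []
        for j in range(len(current)):
            p, q = current[j - 1], current[j]
            if inside(q, a, b):
                if not inside(p, a, b):
                    output.append(intersect(p, q, a, b))
                output.append(q)
            elif inside(p, a, b):
                output.append(intersect(p, q, a, b))
        if not output:  # the polygons do not overlap
            break
    return output


def symmetric_difference_area(poly_a, poly_b):
    """|A triangle B| = |A| + |B| - 2 |A intersect B| for convex polygons."""
    overlap = polygon_area(clip_convex(poly_a, poly_b))
    return polygon_area(poly_a) + polygon_area(poly_b) - 2.0 * overlap
```

Averaging `symmetric_difference_area` over simulated hulls and over their
dilations, each compared with the true polygon, would give a Monte Carlo
approximation of the two expectations in the error ratio.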


6 Appendix 

6.1 Proof of Theorem 3.1 

The proof is split into several statements, which might be of interest on their 
own. 

Figure 4: Monte Carlo error ratio for the convex hull and its dilation when
the true set is a polygon.


Lemma 6.1. The random variable $\mathcal N(\mathcal K)$ is measurable with
respect to $\mathcal F_{\mathcal K}$ for any stopping set $\mathcal K$.

Proof. The proof is just a generalisation of the analogous statement for
time-indexed stochastic processes, see e.g. Proposition 2.18 in Karatzas and
Shreve (2012). For this, the notions are extended to the partial order
$\subseteq$ and then the right-continuity of
$(\mathcal N(K), K \in \mathbf K)$ (with respect to inclusion) implies its
progressive measurability and thus in turn the measurability of
$\mathcal N(\mathcal K)$. □

Next, observe that the set-indexed process $(\mathcal N(K), K \in \mathbf K)$
has independent increments, i.e. for $K_1, \ldots, K_m \in \mathbf K$ with
$K_i \subseteq K_{i+1}$, $i = 1, \ldots, m - 1$, the random variables
$\mathcal N(K_{i+1}) - \mathcal N(K_i) = \mathcal N(K_{i+1} \setminus K_i)$
are independent (by the independence of the PPP on disjoint sets). In fact,
we show in Proposition 6.2 that the process $\mathcal N$ is even a strong
Markov process. In addition, Proposition 6.2 yields (3.1) using that the
closed complement $\hat{\mathcal K} = \overline{\hat C^{c}}$ of the convex
hull is a stopping set.

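Proposition 6.2. The set-indexed process $(\mathcal N(K), K \in \mathbf K)$
is strong Markov at every stopping set $\mathcal K$. More precisely,
conditionally on $\mathcal F_{\mathcal K}$ the process
$(\mathcal N(K \setminus \mathcal K), K \in \mathbf K)$ is a Poisson point
process with intensity $\lambda$ on $\mathcal K^{c}$. In particular,
$\mathcal N(K \setminus \mathcal K) \,|\, \mathcal F_{\mathcal K} \sim \mathrm{Poiss}(\lambda|K \setminus \mathcal K|)$
holds for all $K \in \mathbf K$.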

Remark 6.3. The fact that the increments
$\mathcal N(\mathcal K \cup K) - \mathcal N(\mathcal K)$ are independent of
$\mathcal F_{\mathcal K}$ can be derived from a general theorem about the
strong Markov property for random fields in Thm. 4 in Rozanov (1982). See
also Zuyev (2006) for a discussion of the strong Markov property and its
applications in stochastic geometry. These statements, however, do not
provide a distributional characterisation of the increments of the process.

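Proof. A set-indexed, $(\mathcal F_K)$-adapted integrable process
$(X_K, K \in \mathbf K)$ is called a martingale if
$\mathbb E[X_B \,|\, \mathcal F_A] = X_A$ holds for any $A, B \in \mathbf K$
with $A \subseteq B$.

By the independence of increments, the process
$M_K = \mathcal N(K) - \lambda|K|$, $K \in \mathbf K$, is clearly a
martingale with respect to its natural filtration
$(\mathcal F_K, K \in \mathbf K)$. Then also the process

$$
\bar M_K = M_{K \cup \mathcal K} - M_{\mathcal K} = \mathcal N(K \setminus \mathcal K) - \lambda|K \setminus \mathcal K|
$$

is a martingale with respect to the filtration
$\bar{\mathcal F}_K \overset{\mathrm{def}}{=} \mathcal F_K \vee \mathcal F_{\mathcal K} = \mathcal F_{K \cup \mathcal K}$
because for $K_1, K_2 \in \mathbf K$ with $K_1 \subseteq K_2$ the optional
sampling theorem (see e.g. Zuyev (1999)) yields

$$
\mathbb E[\bar M_{K_2} \,|\, \bar{\mathcal F}_{K_1}]
 = \mathbb E[M_{K_2 \cup \mathcal K} - M_{\mathcal K} \,|\, \mathcal F_{K_1 \cup \mathcal K}]
 = M_{K_1 \cup \mathcal K} - M_{\mathcal K} = \bar M_{K_1},
$$

noting that $K_1 \cup \mathcal K$ is again a stopping set.

This implies that $\lambda|K \setminus \mathcal K|$, conditionally on
$\mathcal F_{\mathcal K}$, is the deterministic compensator of the process
$\bar N_K = \mathcal N(K \setminus \mathcal K)$. Then, due to the martingale
characterisation of the set-indexed Poisson process, see Thm. 3.1 in Ivanoff
and Merzbach (1994) (analogue of Watanabe's characterisation for the Poisson
process), the process $\bar N_K$, conditionally on $\mathcal F_{\mathcal K}$,
is a Poisson point process with mean measure
$\mu(A) = \lambda|A \cap \mathcal K^{c}|$. □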

The last statement of Theorem 3.1, that
$\mathcal F_{\hat{\mathcal K}} = \sigma(\hat C)$, is shown next. It can be
seen as a generalisation of the interesting fact that for a time-indexed
Poisson process the sigma-algebra $\sigma(\tau)$ associated with the first
jump time $\tau$ coincides with the sigma-algebra of $\tau$-history
$\mathcal F_\tau$.

Lemma 6.4. The sigma-algebra $\sigma(\hat C)$ coincides with the
sigma-algebra $\mathcal F_{\hat{\mathcal K}}$ of $\hat{\mathcal K}$-history,
i.e. $\sigma(\hat C) = \mathcal F_{\hat{\mathcal K}}$.

Proof. Since $\hat{\mathcal K}$ is $\mathcal F_{\hat{\mathcal K}}$-measurable
by Lemma 1 in Zuyev (1999) and $\hat C = \overline{\hat{\mathcal K}^{c}}$, it
is evident that $\sigma(\hat C) \subseteq \mathcal F_{\hat{\mathcal K}}$. The
other direction is more involved. We use that the sigma-algebra
$\mathcal F_{\hat{\mathcal K}}$ coincides with the sigma-algebra
$\sigma(\{\mathcal N(\hat{\mathcal K} \cap K), K \in \mathbf K\})$ generated
by the process stopped at $\hat{\mathcal K}$. This statement can be derived
from Thm. 6, Ch. 1 in Shiryaev and Aries (2007). Note that their assumption
(1.11) is satisfied in our case, because for all $K \in \mathbf K$ and
$\omega \in \Omega$ there is $\omega'$ such that
$\mathcal N(U \cap K, \omega) = \mathcal N(U, \omega')$ for all
$U \in \mathbf K$, which simply says that when observing points in
$K \in \mathbf K$ there might be no points outside $K$. Finally, observe that
by definition of the convex hull
$\mathcal N(\overline{\hat C^{c}} \cap K) = \mathcal N((\partial \hat C) \cap K)$.
Modulo null sets, $\mathcal N((\partial \hat C) \cap K)$ counts the number of
vertices of $\hat C$ in $K$ and is thus $\sigma(\hat C)$-measurable. □

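Proof of Lemma 4.1. Using that the bias of the oracle estimator
$\hat\vartheta = |\hat C| + N_\partial/(N_\circ + 1)\,|\hat C|$ is
exponentially small, it remains to compare its expectation with the
expectation of the plug-in estimator $\hat\vartheta_{plugin}$ to show (4.1):

$$
\begin{aligned}
\mathbb E[\hat\vartheta - \hat\vartheta_{plugin}]
 &= \mathbb E\Big[ |\hat C| \Big( \frac{N_\partial}{N_\circ + 1} - \frac{N_\partial}{N} \Big) \Big]
  = \mathbb E\Big[ |\hat C| \, \frac{N_\partial^2 - N_\partial}{(N_\circ + 1)(N_\circ + N_\partial)} \Big] \\
 &\ge \frac{d}{d+1} \, \mathbb E\Big[ \frac{|\hat C| \, N_\partial^2}{(N_\circ + 1) \, 2\lambda|C|} \, \mathbf 1(N \le 2\lambda|C|) \Big],
\end{aligned}
$$

where in the last line we have used that $|\hat C| > 0$ only if
$N_\partial \ge d + 1$ and that in this case
$N_\partial^2 - N_\partial \ge \frac{d}{d+1} N_\partial^2$. Using
$\mathbb E[(N_\circ + 1)^{-1} \,|\, \hat C] = \lambda_\circ^{-1}(1 - e^{-\lambda_\circ})$
from above, we obtain after writing
$\mathbf 1(N \le 2\lambda|C|) = 1 - \mathbf 1(N > 2\lambda|C|)$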


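$$
\begin{aligned}
\mathbb E[\hat\vartheta - \hat\vartheta_{plugin}]
 &\ge \frac{d}{d+1} \Big( \mathbb E\Big[ \frac{N_\partial^2 \, |\hat C| \, (1 - e^{-\lambda_\circ})}{2\lambda_\circ \lambda|C|} \Big]
   - \mathbb E\Big[ \frac{N_\partial^2 \, |\hat C|}{(N_\circ + 1) \, 2\lambda|C|} \, \mathbf 1(N > 2\lambda|C|) \Big] \Big) \\
 &\ge \frac{d}{d+1} \Big( \frac{\mathbb E[N_\partial^2 (1 - e^{-\lambda_\circ})]}{2\lambda^2|C|}
   - \frac{\mathbb E[N^2 \, \mathbf 1(N > 2\lambda|C|)]}{2\lambda} \Big).
\end{aligned}
$$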


By the Cauchy-Schwarz inequality and large deviations similarly to (4.4), the
first term is bounded from below by a constant multiple of
$\mathbb E[|C \setminus \hat C|]^2/|C|$ in view of
$\mathbb E[N_\partial^2] \ge \lambda^2 \, \mathbb E[|C \setminus \hat C|]^2$,
see e.g. Section 2.3.2 in Brunel (2014). Because of
$N \sim \mathrm{Poiss}(\lambda|C|)$, the second term is of order
$\lambda|C|^2 e^{-\lambda|C|}$ and thus asymptotically of much smaller
order. □
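
For completeness, the two elementary facts used in this proof can be checked
directly. The derivation below is a sketch that assumes, as in the earlier
sections referred to by "from above", that conditionally on $\hat C$ the
number of interior points satisfies $N_\circ \sim \mathrm{Poiss}(\lambda_\circ)$
with $\lambda_\circ = \lambda|\hat C|$.

```latex
\begin{align*}
\mathbb E\big[(N_\circ + 1)^{-1} \,\big|\, \hat C\big]
  &= \sum_{k \ge 0} \frac{1}{k+1} \, e^{-\lambda_\circ} \frac{\lambda_\circ^{k}}{k!}
   = \frac{e^{-\lambda_\circ}}{\lambda_\circ} \sum_{k \ge 0} \frac{\lambda_\circ^{k+1}}{(k+1)!}
   = \frac{e^{-\lambda_\circ}}{\lambda_\circ} \big(e^{\lambda_\circ} - 1\big)
   = \lambda_\circ^{-1} \big(1 - e^{-\lambda_\circ}\big), \\
N_\partial \ge d + 1
  &\;\Longrightarrow\;
  N_\partial - 1 \ge N_\partial - \frac{N_\partial}{d+1} = \frac{d}{d+1} N_\partial
  \;\Longrightarrow\;
  N_\partial^2 - N_\partial \ge \frac{d}{d+1} N_\partial^2 .
\end{align*}
```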

Figure 5: Monte Carlo RMSE estimates for the studied estimators for the
volume of the unit cube $C = [0,1]^d$ in dimensions d = 3, 4, 5, 6.